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Abstract.  Many  engineering  applications  require  extracting  a  signal  from  observations 
corrupted  by  additive  noise,  possibly  heavy-tailed.  We  assume  that  the  observation  noise 
is  a  Levy  process,  while  the  signal  is  Gaussian,  and  derive  a  non-linear  recursive  filter 
that  minimizes  the  error.  A  sub-optimal  filter  is  proposed  for  numerical  purposes,  and 
simulations  show  that  it  out-performs  the  existing  linear  filter. 
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1  Introduction 


Estimation  problems  involving  non-Gaussian  noise  have  received  much  attention  in  recent 
years;  see  for  example  Kassam  (1988)  and  Nikias  and  Shao  (1995)  and  references  therein.  In 
this  paper  we  consider  filtering  of  Gaussian  signals  contaminated  by  independent  infinitely 
divisible  noise,  a  class  that  includes  compound  Poisson,  alpha-stable,  and  Gaussian  noises 
as  special  cases.  As  in  Nikias  and  Shao  (1995),  our  intent  here  is  to  develop  a  methodology 
that  works  in  the  presence  of  heavy-tailed,  possibly  infinite  variance,  noise. 

A  practical  application  in  which  the  observation  noise  can  be  represented  as  a  com¬ 
bination  of  a  Gaussian  noise  and  a  discrete  shock  is  the  tracking  error  in  code  tracking 
loops,  e.g.,  such  as  those  used  in  Global  Positioning  System  (GPS)  receivers.  The  tracking 
error,  defined  as  the  time  offset  between  the  received  signal  (plus  noise)  and  the  internally- 
generated  version  of  the  same  signal,  would  be  purely  Gaussian  except  that  occasionally 
the  tracking  slips  forward  or  backward  by  one  time  step,  introducing  a  discontinuity. 

Another  example  can  be  found  in  many  coherent  communication  systems,  where  the 
receiver  must  estimate  the  phase  of  a  carrier,  a  process  known  as  phase  tracking-,  see  Poor 
(1988),  p524.  It  is  customary  to  model  the  unknown  phase  as  a  Gaussian  process,  and  in 
some  applications,  it  is  also  appropriate  to  model  the  noise  and  interference  as  Gaussian. 
When,  however,  the  number  of  significant  interference  sources  is  small  and  when  interferers 
are  pulsed  or  hopped,  the  noise  and  interference  may  be  described  better  by  the  combination 
of  a  Wiener  process  and  a  jump  process. 

Filtering  problems  involving  heavy-tailed  noise  were  initiated  by  Stuck  (1978).  He  ex¬ 
amines  a  classic  set-up  for  Kalman  filtering  with  i.i.d.  plant  and  observation  noises,  each 
coming  from  the  same  symmetric  alpha-stable  distribution.  A  linear  recursive  filter  was 
chosen  to  satisfy  the  criterion  of  minimum  dispersion  of  the  prediction  error.  Cambanis 
and  Miller  (1981)  consider  among  other  linear  problems  the  linear  filtering  of  a  signal  in 
noise  when  both  signal  and  noise  belong  to  a  special  class  of  processes,  e.g.,  harmonizable. 

Their  solution  is  rather  abstract  as  it  is  written  in  terms  of  integrals  with  respect  to  sta¬ 

ble  measures.  Nikias  and  Shao  (1995)  consider  adaptive  Wiener-type  filters  constructed 
according  to  the  minimum  dispersion  criterion.  The  closest  to  our  work  is  the  paper  by 
Le  Breton  and  Musiela  (1993).  They  derive  a  linear  filter  for  a  continuous  time  system  in 
which  both  the  signal  and  observation  processes  are  given  by  linear  equations  with  respect 
to  general  semimartingale  noises  and  show  that  their  filter  minimizes  1/  error. 

Here  we  adopt  a  specific  class  of  noise  processes  and  exploit  analytic  properties  of  these 
processes.  More  specifically,  we  assume  that  the  signal  process  X  and  the  observation 
process  Y  evolve  according  to  the  following  stochastic  differential  equations: 

Xt  =  Xq  J  a{s)Xsds  -|-  b(^s'jd^,  (1) 

Yt  —  lo  +  /  c(s)Xsds+  f  d{s)dW^+  Jt,  (2) 

JO  JO 

where  B  and  W  are  independent  standard  Brownian  motions,  and  J  is  a  quadratic  pure 
jump  Levy  process,  that  is,  a  process  with  stationary  independent  increments  and  with 
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quaxiratic  variation  that  has  no  continuous  part.  See  Protter  (1990)  for  details;  in  partic¬ 
ular,  such  processes  are  independent  of  a  Brownian  motion.  The  coeflBicients  a,  b,  c,  d  are 
continuous  real  valued  functions  of  time.  We  further  assume  for  convenience  that  Xq  and 
Yq  are  constants. 

If  one  is  allowed  to  monitor  Y  continuously  in  time,  one  knows  exactly  when  each  jump 
occurs  and  can  extract  the  continuous  part  of  the  observation: 

Yt^  =  Fo  +  r  c{s)Xsds  d{s)dWt.  (3) 

Jo  Jo 

Thus,  to  estimate  the  signal  X,  one  would  apply  the  Kalman-Bucy  filter  to  the  F*^  extracted 
from  the  observation.  However,  a  more  realistic  assumption  would  be  that  the  observations 
arrive  in  discrete  times  0  <  ti  <  ^2  <  •  •  •  >  because  any  digital  signal  processing  system  can 
only  sample  at  a  finite  rate  limited  by  the  speed  of  an  analog-to-digital  (A/D)  convertor, 
and  can  only  process  observations  at  a  finite  rate  determined  by  the  available  processor 
power.  In  this  case  the  jump  can  not  be  removed  and  a  different  filter  is  required. 

This  paper  establishes  the  optimal  filter  based  on  observations  sampled  in  discrete  times, 
where  the  optimality  refers  to  minimizing  the  error,  that  is  distance  between  the  signal 
and  the  filter  output.  In  other  words,  we  find 

X,,=E\x,\Tt]  (4) 

where  -  =  o'(Fi)  -  -  •  >  F^-}  is  the  sigma  field  generated  by  observations  obtained  up  to  time 
tj.  Since  our  signal  is  Gaussian,  the  integrability  assumption  is  satisfied.  Note  that  we  do 
not  restrict  our  filter  to  be  linear.  In  fact  when  the  noise  is  strictly  non-Gaussian,  i.e.,  jump 
noise  J  is  present,  (4)  yields  a  non-linear  filter.  Although  a  linear  filter  is  more  tractable, 
it  could  produce  a  poor  estimate  in  the  presence  of  heavy  tailed  noise.  For  example,  when 
J  is  alpha-stable,  the  observation  process  has  an  infinite  variance.  Thus,  any  linear  filter 
would  have  an  infinite  error. 

In  the  remainder  of  this  introduction,  we  summarize  the  results  in  this  paper.  For 
simplicity,  A  denotes  an  increment  of  a  process;  for  example,  AF,-^j  =  F,,.,  -  F,.  To 
obtain  the  optimal  filter,  we  solve  a  two  stage  filtering  problem.  In  the  firet  stage,  we 
compute  a  pseudo  filter  X  by  assuming  that  we  are  able  to  recognize  the  proportion  of 
noise  contributed  by  J.  That  is,  Xt-  =  where  is  a  pseudo  filtration  that 

contains  the  information  of  {Jt^, . . . ,  Jtj)  as  well  as  i.e.,  Gtj  =  V 

Noting  that  {(Xt^.,  Fp,  j  =  0, 1, . . .}  is  jointly  normal  and  is  independent  of  J,  we  exploit 
the  innovation 

=  (5) 

which  is  a  sequence  of  independent  normal  random  variables,  to  obtain  a  linear  recursive 
equation  that  defines  X: 

(«) 
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for  suitable  coefficients  Xj,  Pj,  and  rjj.  We  calculate  these  coefficients  in  Propositions  2.1 
and  2.2  of  Section  2. 

The  second  stage  consists  of  filtering  the  pseudo  filter  X  to  the  original  filtration  T.  In 
other  words, 

4  =  (7) 

which  is  valid  since  Qt^  contains  Tty  Unfortunately,  non-linearity  appears  in  this  procedure 
as  we  mentioned  earlier,  and  knowing  Xtj  is  not  enough  to  update  For  this  reason, 

we  first  find  the  conditional  characteristic  function  E  \Tt^ ,  which  can  be  updated 

recursively. 

Theorem  1.1.  Let  ^  be  the  standard  normal  density,  and  let  fj^i  be  the  characteristic 
function  of  Mtj^^  +  where  is  dehned  by  (5).  Then  IIj+i(0)  = 

satisfies  the  following  recursive  equation: 

HhiW  J  =  (8) 

v^,^(A,cr,+,e)  J + 

where  oj+i  =  Var(Mf^.^j)  and  7^  =  Xj^{Pj  -  rjj);  Xj,pj,  and  rjj  are  as  in  (6). 

The  proof  of  the  above  theorem  is  given  in  Section  3.  Since  is  a  A^(0,  random 

variable  independent  of  AJtj^^  one  obtains  fj+i{(n)  as  the  product  of  exp(-aj^ia;^/2)  and 
the  characteristic  function  of  AJtj^y  (See  Feller  (1971),  Vol.  2,  for  the  characterization  of 
the  infinitely  divisible  characteristic  functions  as  well  as  examples.)  Differentiating  Hj+i{6) 
with  respect  to  9  and  evaluating  at  0,  one  obtains  the  optimal  filter: 

Corollary  1.1.  Let  i7j*(w)  =  and  ^tj+i  sat¬ 

isfies  the  following  equation: 

(9) 

(8)  and  (9)  describe  the  optimal  filter  recursively.  Unfortunately,  evaluating  these  equa¬ 
tions  may  require  considerable  computation.  For  each  j,  one  must  evaluate  Hj  at  densely 
sampled  0  to  update  Hj+i,  and  this  requires  evaluating  Fourier  integrals  as  many  as  the 
sample  size  of  9.  Consequently  this  method  may  be  impractical  when  observations  arrive 
frequently.  Because  of  these  problems,  we  propose  an  alternative  filter  that  we  obtain  by 
approximating  and  Hf  in  (9).  It  can  be  shown  that  7,-  =  0(tj+i  -  tj)  as  -  tj  -4  0, 
and  hence,  for  small  tj+i  -  tj  we  may  adopt  the  following  approximations: 

i7;(0)  ~  and  iTf  (0)  ~  (10) 
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where  Xtj  is  the  output  at  time  tj  of  an  approximate  filter.  Substituting  (10)  into  (9)  and 
applying  the  Fourier  inversion,  we  obtain  another  filter  X  that  can  be  obtained  from  the 
following  recursion: 


X,s„  =  -  7,4)  +  .  (11) 

It  turns  out  that 

+  AJ.,„  =  !,] 

is  the  minimum  variance  unbiased  estimate  of  given  +  AJtj+i ,  while  - 

jjXtj  =  +  ^’^tj+i  •  Thus,  it  is  reasonable  to  expect  that  the  sub-optimal  filter  X 

defined  by  (11)  will  perform  well  as  long  as  'yj  is  small.  In  Section  4,  we  use  Monte  Carlo 
simulation  to  compare  the  performance  of  this  sub-optimal  filter  with  that  of  the  linear 
filter  proposed  in  Le  Breton  and  Musiela  (1993).  The  simulation  results  confirm  that  X 
performs  far  better  than  the  linear  filter  in  the  presence  of  sizable  jumps.  Moreover  the 
performance  of  X  is  robust  to  misspecified  noise  structures.  This  is  an  attractive  property; 
in  practice  it  is  difficult  to  determine  the  characteristics  of  the  Gaussian  and  jump  noises. 


2  Computation  of  coefficients 

In  this  section,  we  prove  (6)  and  as  a  by-product  obtain  formula  for  the  coefficients  that 
appear  in  (6)  and  in  Theorem  1.1,  namely:  A,  /?,  7,  rj,  and  c.  Recall  that  Qtj  is  the  sigma  field 
generated  by  1^^, ...  as  well  as  Jt^,  and  that  the  innovation  M  is  a  martingale 

difference  sequence  defined  by 

f^7«=AF4-E[A4„|ft,]  (12) 

with  Mt^  =  0.  Since  J  is  independent  of  F;[Ay^^jAi;^,...,  AF^'] , 

and  the  joint  normality  implies  that  is  independent  or  Qtj- 

Proposition  2.1.  Let  A{t)  satisfy  dA{t)  —  a(t)A{t)dt  with  A(0)  =  1,  and  dehne 

=  A{tj+i)IA{tj)  and  pj+i  =  A'^itj+i)  j^''^\{s)A{s)ds .  (13) 

Then  Mtj+,  =  AF^^^j  -  pj+i/3jXtj  and  Xtj+i  =  -f  ^jXt^  where 

Xj  =  Cov{Mt.^^ ,  Mt.^,  +  c{s)b{s)dB,  -  f'*"  d(s)dW,) /p,+i  Var  (m* .  (14) 

'f  tj  J  tj 

In  addition,  we  obtain  pj  =  Pj  —  XjPj^iPj  as  well  as  7,  =  PjPj+i- 
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(15) 


Proof.  Because  dA-'^{t)  =  -a{t)A-^{t)dt  and  dXt  =  a(t)X{t)dt  +  b{t)dBt,  integration  by 
parts  yields 

A-\t)Xt  =  Xo  +  r  A-^{s)bis)dB, . 

Jo 

In  the  sequel,  Ht A~^{t)Xt  has  independent  increments.  Therefore, 

=  c{s)A{s)H,ds+  dis)dWs 

^  Jtj  Jtj 

and  =  A~^{tj)Xtj  whenever  s>tj.  Prom  this,  we  obtain 

Mt, ,  1  =  /  c(s)A{s)Hsds+  d{s)dWs-pj+iPjXtj. 

Jtj  Jtj 

On  the  other  hand,  integration  by  parts  yields 

f''^\{s)Ais)Hsds  =  f''^\{s)A{s)ds-  t'^\{s)b{s)dB{s) 

Jtj  Jtj  Jtj 

=  Pj+iXtj^,  -  /  c{s)b{s)dB{s) . 

Jtj 


(16) 


Substituting  this  into  (16),  one  has 

=  —  K-«  +  f“*'  c(s)b(s)dB,  -  !“*'  d(s)dW.\  +  HiXt, .  (17) 

Note  that  the  first  term  in  (17)  is  independent  of  Qtj  while  PjXtj  is  measurable  with  respect 
to  Qtj.  Thus, 

Xu,,  =  —E\Mt,^,  +  c{s)b{s)dB,  -  d{s)dW,\Mt^^,]  +  pjXtj 

Pj+i  I-  Jtj  Jtj 

and  the  rest  follows  from  the  joint  normality. 

□ 


In  the  following  proposition,  we  find  a  recursive  expression  for  (14). 
Proposition  2.2.  Let  =  Yax{Mtj^,)  and  vj+i  =  E^{Xtj^,  —  Xt^+j)^]-  Then 


Xj 


Vj+i 


<Jj+i  +  ^j+i 
Pj+i(^j+i 


-  (lj^{s)[pj+i^^^-c{s)^  +d^{sfjds  +  (pj+iPj)\ 

=  pjh  f  (c^(5)&^(s)  +  d^(s))  ds  —  Pjhoj^iil  —  pj+iXj)^ 

Jtj 


where 


r,+i  =  I’*'  {b'‘(s)c{s) -  <^(«)]  “  * ■ 


(18) 

(19) 

(20) 
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Proof.  Note  that 


/  c(s)A{s)Hsds  =  /  c{s)A{s)iHs-Ht,)ds+pj+,pjXt. 

J  ttj  J  tj 

=  c{s)A{s)ds-  c{s)b{s)dBsApj+iPjXt^ 

Jtj  Jtj 

where  the  last  equality  is  obtained  via  integration  by  parts  with  dHs  =  b{s)A~^{s)dBs. 
Then  (16)  can  be  rewritten  as 

-  c{s)]b{s)dB.  + 1‘‘*'  d{s)dW.  -  Xy) .  (21) 

Therefore  (19)  follows  immediately.  (21)  also  implies  that 

Cov(m,,^„  c{s)bis)dBs  -  d{s)dWs)  =  rj+i 
and  hence  (18)  follows.  Next,  subtracting  from  (17),  one  has 

Pi+i(^t,+i  -  ^f,-+i)  =  (1  -  V3+iXj)Mt.^^  +  /  c{s)b{s)dBs  -  /  d{s)dW, . 

•*/  tj  tj 

On  the  other  hand,  (14)  implies  that 

Cov c{s)b(s)dBs-  f  d{s)dWs)  = -  pj+iXj). 

•'tj  ^3 

Therefore 

Pj+iVj+i  =  f  (^(s)b^(s)  +  d^s))  ds  -  aj+i(l  -  Pj+iXjf 

•'tj 

and  we  obtain  (20). 

□ 


3  Proof  of  Theorem  1.1 

We  will  prove  Theorem  1.1  by  evaluating  Hj+i{6)  =  for  a  fixed  but  an 

arbitrary  j.  Both  and  depend  on  J-t^^  and  this  dependence  makes  it  difficult 

to  compute  the  conditional  expectation.  In  Gaussian  cases,  one  can  use  an  innovation 
process  to  remove  such  dependencies.  However  such  methods  generally  do  not  exist  for  non- 
Gaussian  noise.  Instead,  we  find  another  probability  measure  which  makes  ( ,  A  ) 
independent  of  Gtj  («Lnd  hence  independent  of  J-tj).  Throughout  this  section,  P  donotes 
the  probability  measure  associated  with  our  model.  The  following  lemma  is  parallel  to 
Girsanov’s  theorem: 
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Lemma  3.1.  DeGne  a  probability  measure  Q  =  Qj+i  via 


dP 

dQ 


-  L  -  exD f  "  AV  ^ 
^i+1 


2  a|+i 


)• 


(22) 


Then  the  law  of  {M^ , . . . ,  Mt. ,  AY^.^^ )  under  Q  is  the  same  as  the  law  of  {Mt^ , .  - . ,  ) 

under  P.  In  particular,  AY^.^^  is  independent  of  Qt- . 


Proof.  Note  that 


r-l  {  Tj^tj  ^ 


<^;+i  2  <t|+. 


is  a  positive  random  variable  of  mean  1  with  respect  to  P,  and  hence  Q  is  well  defined.  Also 
note  that,  under  P,  “  Ij^tj  is  a  A^(0,  ^j+l)  random  variable  independent 

of  Qtj .  Therefore 

|6,,]  =  (23) 

almost  surely,  and  the  result  follows. 

□ 

In  what  follows,  Eq  denotes  the  expectation  under  Q,  while  Ep,  or  simply  E,  is  used 
for  the  expectation  under  P.  We  will  use  the  following  version  of  Bayes’  rule: 

(tf)  ■  [  i,,„  I  ]  =  £«  [  (24) 

(See  Karatzas  and  Shreve  (1987),  pl93.)  The  next  step  is  to  prove  the  following: 

]  •  /j«(Ay;,„)  =  (26) 

Then  Eq^Lij^.^  |  Etj+i]fj+i{^ytj+i)  can  be  obtained  by  evaluating  (25)  at  0  =  0,  and  hence 
(25)  completes  the  proof  of  Theorem  1.1. 

Lemma  3.2.  Let  Y  =  W  +  Z]  where  W  is  a  N{0,a^)  random  variable  independent  of  Z. 
Then 

f(y).B[e<^'^exp{^W-~)\Y  =  y]  =  ^<Hio)  //(»)e-''’‘"e-«»-')"d..  (26) 


where  f  is  the  density  of  Y,  f  is  the  corresponding  characteristic  function,  and  4>  is  the 
standard  normal  density. 
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Proof.  Note  that  the  Fourier  transform  of  /(y)  £^fe^^expf-^TF  —  \y 

L  ^  \  (j2,  2  CF^  F  \ 

[/  CC  1  \  1 

gxp^_j/{/  _  __  j  which  further  reduces  to 
E[e^‘^^]  •  E[e*(‘"+«)'^exp(^T'F  -  ]  =  /(w)  .  gi(u;+Ox-|( 


y 


(27) 


The  inverse  Fourier  transform  of  (27)  is  identical  to  the  right  side  of  (26),  and  the  result 
follows. 

□ 

Now  we  resume  the  proof  of  Theorem  1.1.  Recall  that  +r)jXtj,  where 

T]j  —  (3j  —  Xjjj.  Since  Xtj  is  -measurable,  we  obtain 


V,  I  |e,,,  AV,.,,  )  |  ]  .  (28) 

Under  Q,  is  a  JV(0,<r|^]^)  random  variable  independent  of  ^Jtj+i  and  fj+i  is  the 

density  of  Therefore,  (26)  and  (28)  yield 

|ft,,  Ay,,„  ] .  /,+,(Ai),^.)  =  (29) 

y2'K  J 

The  last  step  is  to  compute  the  conditional  expectation  of  (29)  with  respect  to  F'tj+i  • 
Note  that  Ait^.^j  is  .Ft^^j-measurable  and 

EQ[e^(~'rju>+0je)Jctj  |  +  fijO)  . 

(This  last  is  a  consequence  of  the  above  lemma.)  Therefore  the  conditional  expectation  of 
(29)  with  respect  to  ^tj+i  yields  (25),  and  the  proof  is  complete. 


4  Numerical  Results 

Because  the  optimal  filter  X  requires  excessive  computation,  we  proposed  the  sub-optimal 
filter  X  defined  in  (11).  Provided  that  observations  arrive  frequently,  the  performance  of 
X  is  expected  to  be  near  by  optimal.  In  this  section,  we  show  simulation  results  in  which 
X  is  compared  with  the  best  linear  filter  obtained  by  Le  Breton  and  Musiela  (1993).  We 
also  provide  simulation  results  that  show  how  X  performs  when  the  correct  recognition  of 
the  noise  structure  fails. 

In  this  section,  we  consider  observations  that  arrive  at  a  fixed  rate,  or  equivalently  the 
arrival  time  of  the  j-th  observation  is  j6,  and  the  last  arrival  time  is  denoted  by  T.  The 
coefficients  of  the  signal  process  (1)  and  the  observation  process  (2),  namely  a,  6,  c,  and  d, 
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are  set  to  be  identically  1,  and  the  signal  at  time  0  is  also  1.  In  this  case,  it  turns  out  that 
/3j  =  =  (3j  -  1,  and 


<^j+i  — 


— 

Vj+l  = 


iPj  -  1)^ 

2P] 


(Pi  -  1)^ 

Pi 


+  (5  +  {Pj  —  l)^Vj , 


^  +  oji,  [ft  -  1  -  2ft  (ft  -  1)-M] , 

P]{Pi  -  1)"^(25  -  P]o^+i) , 


while  Vo  =  0.  Thus  one  updates  aj+i,  Xj,  and  vj+i  in  turns.  We  take  J  to  be  a  symmetric 
alpha  stable  process.  Such  processes  have  been  used  for  modeling  heavy  tailed  noise;  see, 
e.g.,  Nikias  and  Shao  (1995).  More  precisely,  the  characteristic  function  of  Jt  is  given  by 


=  exp(-tCl^r) 

where  a  is  a  stable  index  and  C  is  a  dispersion  parameter.  Different  values  of  these  param¬ 
eters  will  be  used  depending  upon  the  purpose  of  simulations. 

To  implement  X  one  must  evaluate  a]^Jj+Jfj+u  but  no  closed  form  expression  for 

fj+i  is  available.  One  can,  however,  use  Fourier  inversion,  as  the  Fourier  transform  ftj  of 
fj+i  is  well  known.  In  order  to  reduce  round-off  errors,  we  use  the  standard  scale:  there  is 
an  appropriate  constant  Sj+i  >  0  such  that  the  law  of  (Mtj+i  +  is  the  same 

as  that  of  M  -1-  Sj+iJ  where  M  is  standard  normal  and  J  is  symmetric  alpha  stable  with 
dispersion  1.  Consequently, 


^{y)  =  ^j+i 

Jj+l 


^i+1 


where  hj+i  is  the  density  of  M  +  Sj+iJ.  The  fast  Fourier  transform  is  used  for  evaluating 
hj+i,  the  inverse  Fourier  transform  of  exp(-0^/2  - 


4.1  Comparison  between  X  and  Linear  Filter 

The  performance  of  X  is  compared  with  that  of  the  best  linear  filter  given  by  Le  Breton 
and  Musiela  (1993).  Three  different  stable  indices  (1.0,  1.4,  and  1.8)  were  taken,  while 
the  dispersion  parameter  is  set  to  be  1.  The  inter-arrival  time  of  observations  is  0.01  and 
the  expiry  T  =  10.  Le  Breton  and  Musiela’s  filter  minimizes  1/  error  (for  a  specified  value 
of  p  >  1)  among  all  linear  filters.  Thus  it  is  not  applicable  when  the  stable  index  is  1.0. 
p  =  1.1  was  used  in  the  other  cases. 

100, 000  Monte  Carlo  simulations  were  generated  to  estimate  1/  error  and  error,  and 
the  results  are  summarized  in  Table  1.  Our  filter  is  denoted  by  ‘AF’  while  LM  is  used  for 
Le  Breton  and  Musiela’s  filter.  For  comparison,  we  also  implemented  the  pseudo  filter  (X) 
using  the  same  simulation,  except  that  the  non-Gaussian  part  of  the  observation  noise  was 
omitted.  The  error  in  this  case  is  2.414,  and  so  is  not  substantially  better  than  that 
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of  X.  (In  fact,  the  error  of  the  psuedo  filter  sets  a  lower  bound  of  the  errors  of  any 
filter.)  Since  the  average  value  of  Xt  is  20959,  the  relative  error  of  X  is  quite  small. 
Also  X  substantially  outperforms  the  best  linear  filter  in  both  and  1/  errors.  As  the 
sample  variance  of  alpha  stable  distribution  diverges  in  the  order  of  where  n  denotes 

the  sample  size,  the  error  of  the  linear  filter  is  decreasing  in  o:  as  shown  in  the  table. 


1/  error 

_  _  _ 

L2 

error 

AF 

LMp 

AF 

LMp 

Of  =  1.0 

1.75636 

n/a 

4.14959 

n/a 

a=  1.4 

1.68752 

4.59625 

3.86310 

180.25826 

a  =  1.8 

2.11866 

4.17176 

5.81958 

28.17528 

Table  1:  Performances  of  X  (AF)  and  Le  Breton-Musiela  p  =  1.1  (LMp) 


4.2  Robustness 

Ideally,  filter  design  proceeds  from  complete  a  priori  knowledge  of  the  structure  of  the  ob¬ 
servation  noise.  The  required  parameters  can  be  estimated  from  the  history.  It  is  important 
to  check  the  sensitivity  of  the  filter  performance  to  departure  of  the  assumed  value  of  a 
from  the  true  value. 


AF  a  =  1.0 

AF  a  =  true 

Kalman 

a  =  1.0 

3.09941 

3.09941 

15903.76435 

a  =  1.4 

3.21928 

3.07577 

20.25297 

a  =  1.7 

3.19790 

3.00424 

3.02967 

P 

II 

bo 

3.19219 

2.99719 

2.63541 

Table  2:  Robustness  of  X  (AF  o:  =  1.0) 

Data  in  Table  2  show  that  X  is  robust  with  respect  to  uncertainty  about  the  value 
of  the  stable  index.  In  order  to  obtain  a  fair  comparison,  we  do  not  vary  the  dispersion 
parameter  which  we  set  0.1.  Observations  arrive  each  0.01  time  unit  and  the  expiry  T  =  5. 
As  before,  100,000  simulation  runs  were  produced  to  estimate  the  error.  We  focus  on  the 
performance  of  Cauchy  filter:  that  is,  X  was  designed  for  the  stable  index  1.0.  The  first 
column  shows  how  this  Cauchy  filter  performs  when  the  actual  stable  index  is  different 
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from  1.0.  Four  different  values  (1.0,  1.4,  1.7,  and  1.8)  of  stable  index  were  considered, 
and  the  performance  was  robust  to  this  misspecification.  The  second  column  shows  the 
performance  of  X  when  the  stable  index  a  is  correctly  specified.  As  shown  in  the  table, 
the  degradation  due  to  use  of  the  wrong  value  of  a  was  not  substantial.  The  last  column 
shows  the  performance  of  Kalman  filter.  The  average  of  Xt  was  148.4,  and  the  error 
for  the  pseudo  filter  was  2.414.  Thus  we  may  conclude  that  the  Cauchy  filter  estimates  the 
signal  accurately  as  well.  When  the  stable  index  is  near  2.0,  it  turns  out  that  the  Kalman 
filter  slightly  outperforms  X.  This  is  due  to  the  small  dispersion  parameter,  as  well  as  large 
stable  index,  which  makes  the  observation  noise  virtually  Gaussian. 


5  Summary  and  Remarks 

In  order  to  extract  a  Gaussian  signal  contaminated  by  an  infinitely  divisible  noise,  such 
as  compound  Poisson  or  alpha  stable,  we  constructed  the  filter  X  that  minimizes  the 
error.  Since  implementing  this  optimal  filter  requires  excessive  computation,  we  propose 
a  more  practical  filter  X  that  approximates  the  optimal  filter.  In  fact,  if  the  function 
^j+ifj+i/fj+i  defined  in  (11)  is  Lipschitz  uniformly  as  maxj  \tj+i  —  tj\  — >  0,  one  can  show 
that  X  —  X  converges  to  0  in  uniformly  on  any  compact  interval.  So  far,  we  have  not 
been  able  to  check  whether  a  specific  model  satisfies  this  assumption.  In  this  paper  we 
include  simulation  results  which  show  that  X  outperforms  the  existing  best  linear  filter, 
provided  that  observations  arrive  sufficiently  frequently.  We  also  show  that  the  performance 
is  insensitive  to  the  misspecification  of  the  observation  noise  distribution.  Although  the 
robustness  of  X  enables  us  to  obtain  an  acceptable  estimate  of  the  signal  even  with  an 
inaccurate  description  of  the  observation  noise,  a  sound  statistical  procedure  for  identifying 
the  infinitely  divisible  noise  would  still  be  of  considerable  benefit  and  should  be  investigated 
in  the  future. 

A  significant  limitation  of  the  results  in  this  paper  is  that  the  signal  is  restricted  to  be 
Gaussian.  In  many  applications  of  signal  detection,  it  is  inappropriate  to  describe  a  signal 
as  a  Gaussian  process.  Although  it  is  questionable  whether  the  optimal  filter  can  be 
obtained  as  a  recursive  algorithm,  a  tractable  sub-optimal  filter  may  be  available  and  could 
be  useful. 
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